Outlier Processing via L1-Principal Subspace

Authors

  • Shubham Chamadia
  • Dimitris A. Pados
Abstract

With the advent of big data, there is a growing demand for smart algorithms that can extract relevant information from high-dimensional large data sets, potentially corrupted by faulty measurements (outliers). In this context, we present a novel line of research that utilizes the robust nature of L1-norm subspaces for data dimensionality reduction and outlier processing. Specifically, (i) we use the Euclidean distance between original and L1-norm-subspace projected samples as a metric to assign a weight to each sample point, (ii) perform (K=2)-means clustering over the one-dimensional weights, discarding samples corresponding to the outlier cluster, and (iii) compute the robust L1-norm principal subspaces over the reduced “clean” data set for further applications. Numerical studies included in this paper from the fields of (i) data dimensionality reduction, (ii) direction-of-arrival estimation, (iii) image fusion, and (iv) video foreground extraction demonstrate the efficacy of the proposed outlier processing algorithm in designing robust low-dimensional subspaces from faulty high-dimensional data.
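
The three steps in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a single L1 principal component (rank 1), computes that component by exhaustive search over sign vectors (practical only for a handful of samples), and uses a plain Lloyd iteration for the (K=2)-means step. All function names and the toy data are hypothetical and chosen for illustration only.

```python
import itertools
import math


def l1_principal_component(samples):
    """Exact rank-1 L1 principal component by exhaustive sign search.

    Relies on the equivalence max_q ||X^T q||_1 = max_b ||X b||_2 over
    b in {-1,+1}^N, with q = X b / ||X b||_2. Cost grows as 2^(N-1),
    so this is only a sketch for very small N with nonzero data.
    """
    n, dim = len(samples), len(samples[0])
    best_norm, best_vec = -1.0, None
    # Fix the first sign to +1 because b and -b span the same subspace.
    for tail in itertools.product((1.0, -1.0), repeat=n - 1):
        signs = (1.0,) + tail
        vec = [sum(s * x[d] for s, x in zip(signs, samples)) for d in range(dim)]
        norm = math.sqrt(sum(v * v for v in vec))
        if norm > best_norm:
            best_norm, best_vec = norm, vec
    return [v / best_norm for v in best_vec]


def residual_weights(samples, q):
    """Step (i): Euclidean distance from each sample to its projection on span{q}."""
    weights = []
    for x in samples:
        coeff = sum(a * b for a, b in zip(x, q))
        weights.append(math.sqrt(sum((a - coeff * b) ** 2 for a, b in zip(x, q))))
    return weights


def two_means_1d(values, iters=100):
    """Step (ii): K=2 Lloyd iteration on scalars; True marks the high-weight cluster."""
    lo, hi = min(values), max(values)
    labels = [False] * len(values)
    for _ in range(iters):
        labels = [abs(v - hi) < abs(v - lo) for v in values]
        high = [v for v, flag in zip(values, labels) if flag]
        low = [v for v, flag in zip(values, labels) if not flag]
        if not high or not low:
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return labels


def clean_l1_subspace(samples):
    """Steps (i)-(iii): weight, cluster, discard, then recompute on the clean set."""
    q_initial = l1_principal_component(samples)
    flags = two_means_1d(residual_weights(samples, q_initial))
    kept = [x for x, is_outlier in zip(samples, flags) if not is_outlier]
    return l1_principal_component(kept), kept


# Toy data (made up): ten points near the horizontal axis plus one gross outlier.
xs = [-5, -4, -3, -2, -1, 1, 2, 3, 4, 5]
data = [(float(x), 0.1 * (-1) ** i) for i, x in enumerate(xs)] + [(0.0, 8.0)]
q_clean, kept = clean_l1_subspace(data)
```

On this toy set the high-weight cluster contains only the point (0.0, 8.0), so the recomputed component lies close to the horizontal axis. For realistic sample counts the exhaustive search would have to be replaced by an efficient exact or approximate L1-PCA solver.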

Related resources

Optimal Algorithms for L1-subspace Signal Processing

We describe ways to define and calculate L1-norm signal subspaces which are less sensitive to outlying data than L2-calculated subspaces. We start with the computation of the L1 maximum-projection principal component of a data matrix containing N signal samples of dimension D. We show that while the general problem is formally NP-hard in asymptotically large N, D, the case of engineer...

Thresholding based Efficient Outlier Robust PCA

We consider the problem of outlier robust PCA (OR-PCA) where the goal is to recover principal directions despite the presence of outlier data points. That is, given a data matrix M∗, where (1 − α) fraction of the points are noisy samples from a low-dimensional subspace while α fraction of the points can be arbitrary outliers, the goal is to recover the subspace accurately. Existing results for ...

Some Options for L1-subspace Signal Processing

We describe ways to define and calculate L1-norm signal subspaces which are less sensitive to outlying data than L2-calculated subspaces. We focus on the computation of the L1 maximum-projection principal component of a data matrix containing N signal samples of dimension D and conclude that the general problem is formally NP-hard in asymptotically large N, D. We prove, however, that the case ...

Nearly Optimal Robust Subspace Tracking and Dynamic Robust PCA

In this work, we study the robust subspace tracking (RST) problem and obtain one of the first two provable guarantees for it. The goal of RST is to track sequentially arriving data vectors that lie in a slowly changing low-dimensional subspace, while being robust to corruption by additive sparse outliers. It can also be interpreted as a dynamic (time-varying) extension of robust PCA (RPCA), wit...

Publication date: 2017